Note: This page's design, presentation and content have been created and enhanced using Claude (Anthropic's AI assistant) to improve visual quality and educational experience.
Week 10 • Sub-Lesson 6

🎯 Hands-On Activities and Assessment

Same Task, Three Ways: test plain chat, chat-with-tools, and Deep Research on your own research question — then judge and verify the results, on free tools only

🎯 What We'll Cover

This is where the week becomes concrete. You have a working definition of agents (10.1), a taxonomy of how they fail (10.2), a map of the tools (10.3), an understanding of agentic RAG (10.4), and an honest free-tier guide (10.5). Now you apply all of it to a research question you actually care about.

The headline activity is “Same Task, Three Ways”: run one question through plain chat, chat-with-tools, and a Deep Research mode, then judge the results with the Week 9 failure taxonomy and verify them with the Week 5 citation checks. Two shorter activities follow, and the week's assessment pulls them together. Everything here is designed to be done entirely on free tools — on a phone or a borrowed laptop if that is what you have — because, as 10.5 argued, a course for this continent cannot assume a subscription.

One last echo of Week 9: every output you produce this week is a snapshot dated May 2026. Part of the assessment is acknowledging that — recording what was true when you did the work, and how soon you would expect it to change.

🧪 Activity 1: Same Task, Three Ways

The core activity. Choose a research question in your own field — specific, genuinely answerable, and not trivially Googleable. Something a knowledgeable colleague would have to think about. You will be the expert judge of the answers, which is the point: you can tell when an AI is wrong in your own area in a way you cannot in someone else's.

Run that one question through three modes, keeping everything else the same:

Then write a one-to-two-page comparison. Do not just say which was “best”. For each mode, record: how deep and specific the answer was; how good the citations were (and whether they exist — you will check in Activity 2); what it got wrong in your expert judgement; and where each mode failed. Then do the analytical core of the exercise:

📊 Apply the Week 9.2 taxonomy explicitly

For every failure you observed, classify it: was it patched (you were using a weak tool — a current one would not fail this way), reduced-but-persistent (a known weakness you can manage with better prompting or tool choice), or structural (something the next model release will not fix — long-tail gaps, compositional error, the reliability-not-accuracy problem from 10.2)? Then, for the structural ones, state what the Week 9.5 verification protocol would have you do about it. This is the muscle the whole activity exists to build.
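One way to keep this classification honest is to log each failure as a small record and then check that every structural one has a verification action written next to it. The sketch below is only an illustration: the names `FailureClass`, `ObservedFailure` and `missing_actions` are hypothetical, and the two example entries are invented placeholders, not real findings.

```python
from dataclasses import dataclass
from enum import Enum


class FailureClass(Enum):
    """The three classes named in the taxonomy above."""
    PATCHED = "patched"
    REDUCED = "reduced-but-persistent"
    STRUCTURAL = "structural"


@dataclass
class ObservedFailure:
    mode: str                      # "plain chat", "chat-with-tools" or "deep research"
    description: str               # what went wrong, in your expert judgement
    failure_class: FailureClass
    verification_action: str = ""  # what the verification protocol has you do


def missing_actions(failures):
    """Return the structural failures that still lack a verification action."""
    return [
        f for f in failures
        if f.failure_class is FailureClass.STRUCTURAL
        and not f.verification_action.strip()
    ]


# Invented example entries, for illustration only.
log = [
    ObservedFailure("plain chat", "Gave no sources at all", FailureClass.REDUCED),
    ObservedFailure("deep research", "Missed a niche regional study", FailureClass.STRUCTURAL),
]

for f in missing_actions(log):
    print(f"Still needs a verification action: [{f.mode}] {f.description}")
```

A spreadsheet with the same four columns does the same job; the point is that a structural failure with an empty action column is an unfinished analysis.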

🔍 Activity 2: Verify a Deep Research Output

This follows directly from Activity 1. Take the Deep Research report you generated and put its citations on trial, using the tools you already have:

Then report the numbers: of the citations the Deep Research tool gave you, how many checked out completely? How many pointed to real sources that said something different from the report's claim? How many cited “papers” that do not appear to exist at all? This is the Week 5 hallucinated-citation exercise carried into the agentic-RAG era, and the results are usually sobering, which is exactly the lesson. A fluent, well-formatted, confident research report is not a verified one.
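The three counts asked for above are easy to tally by hand, but a short script keeps the arithmetic consistent if you check many citations. This is a minimal sketch under stated assumptions: the labels `verified`, `misattributed` and `not_found` and the function `summarise_audit` are hypothetical names chosen for this example, and the sample list is placeholder data rather than a real audit result.

```python
from collections import Counter

# Hypothetical labels for the three outcomes described above.
VERIFIED = "verified"            # source exists and supports the claim
MISATTRIBUTED = "misattributed"  # source exists but says something different
NOT_FOUND = "not_found"          # source does not appear to exist
LABELS = (VERIFIED, MISATTRIBUTED, NOT_FOUND)


def summarise_audit(outcomes):
    """Return one summary line per label: count, total and percentage."""
    unknown = set(outcomes) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown outcome labels: {sorted(unknown)}")
    counts = Counter(outcomes)
    total = len(outcomes)
    lines = []
    for label in LABELS:
        n = counts[label]
        share = 100 * n / total if total else 0.0
        lines.append(f"{label}: {n} of {total} ({share:.0f}%)")
    return lines


# Placeholder data: one label per citation you checked, in any order.
sample = [VERIFIED] * 6 + [MISATTRIBUTED] * 3 + [NOT_FOUND] * 1

for line in summarise_audit(sample):
    print(line)
```

Reporting the total alongside each count matters: "three misattributed" means something different out of ten citations than out of fifty.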

🔌 Activity 3 (Optional): A Small MCP Workflow

For students who want to go further. Using one of Claude.ai's free connectors — the creative connectors, or (since April 2026) the read-only Microsoft 365 connector — wire up one small step of your research workflow, involving nothing personal or confidential. Write 250 words on what worked, what failed, and — most importantly — what you would never let it do unsupervised, and why. The point is not the connector; it is articulating your own permissions dial (10.1) for a real task.

🔒 A note on the Microsoft 365 connector

It needs a business or education Microsoft account (not a personal @outlook.com one), and your institution's IT must allow the connection. There is a simple way to find out whether yours does: just try adding it. If Microsoft shows an “administrator approval required” screen, self-service connection is not enabled for your institution and you would need IT to approve it. Reading that consent screen — and deciding whether you would even want to grant the access it asks for — is itself a useful exercise in the permissions thinking this week is about.

🌐 The Data-Disclosure Rule (Required, Graded)

Every tool in this activity processes your data outside South Africa — the US-based assistants and the Chinese ones alike (10.5). So a disclosure statement is part of the deliverable, not an optional extra. Two firm rules and a template:

📄 Disclosure statement template (copy, complete, submit)

“For this exercise I used: [tool 1], [tool 2], [tool 3].
Each processes data outside South Africa, in: [country/region per tool, e.g. United States / China / unknown].
The research question involved no identifiable personal information or third-party confidential material.
I verified the outputs as follows: [Five-Point Citation Check / dated-research check / other].
Outputs are accurate as of [date]; I would expect the tool capabilities and free-tier limits described to change within [estimate].”
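If you prefer to generate the statement rather than edit it by hand, the template can be filled in with a few lines of Python. This is a sketch only: the function `disclosure_statement` and its parameters are hypothetical, the tool names in the example are placeholders, and the script cannot check that your question really contained no personal or confidential material. That judgement stays with you.

```python
from datetime import date


def disclosure_statement(tools, verification, as_of, expected_change):
    """Fill in the disclosure template.

    tools: dict mapping each tool name to the country or region where it
           processes data; an empty value is reported as "unknown".
    """
    if not tools:
        raise ValueError("List at least one tool")
    names = ", ".join(tools)
    regions = "; ".join(
        f"{name}: {region or 'unknown'}" for name, region in tools.items()
    )
    return (
        f"For this exercise I used: {names}.\n"
        f"Each processes data outside South Africa, in: {regions}.\n"
        "The research question involved no identifiable personal information "
        "or third-party confidential material.\n"
        f"I verified the outputs as follows: {verification}.\n"
        f"Outputs are accurate as of {as_of.isoformat()}; I would expect the "
        "tool capabilities and free-tier limits described to change within "
        f"{expected_change}."
    )


# Placeholder values, for illustration only.
print(disclosure_statement(
    tools={"Tool A": "United States", "Tool B": "China", "Tool C": ""},
    verification="Five-Point Citation Check",
    as_of=date(2026, 5, 1),
    expected_change="three months",
))
```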

📝 The Week 10 Assessment

The assessment is a single piece of roughly 1,500 words, in the same spirit as the Week 9 assessment: an explicitly dated snapshot that acknowledges its own coming obsolescence. Free tools only: this is a hard rule, so that everyone is judged on a level playing field regardless of what they can afford. Required sections:

Section | What it contains
Tool comparison | The Activity 1 three-way comparison, with concrete observations per mode.
Applied failure taxonomy | Each observed failure classified patched / reduced / structural, with the Week 9.5 action for the structural ones.
Verification audit | The Activity 2 citation check, with the numbers: how many citations held up, how many didn't.
Data-flow disclosure | The completed disclosure statement and a sentence on the POPIA reasoning behind it.
Staleness reflection | What is dated about your findings, and your recommended retest cadence.
“If I had a paid subscription” | One honest paragraph on what you could not do for free, and whether it would have changed your conclusions.

🗺️ Week 10 in One Page

Pulling the week together:

The one idea to keep

Agents change what the tools can do. They do not change who is responsible for the result. Every capability in this week shifts work onto the machine and verification onto you — and the researcher who understands that trade, and keeps the verification, is the one who benefits from agents instead of being misled by them.